190 research outputs found
Penalized maximum likelihood estimation and variable selection in geostatistics
We consider the problem of selecting covariates in spatial linear models with
Gaussian process errors. Penalized maximum likelihood estimation (PMLE) that
enables simultaneous variable selection and parameter estimation is developed
and, for ease of computation, PMLE is approximated by one-step sparse
estimation (OSE). To further improve computational efficiency, particularly
with large sample sizes, we propose penalized maximum covariance-tapered
likelihood estimation (PMLE_T) and its one-step sparse estimation
(OSE_T). General forms of penalty functions with an emphasis on
smoothly clipped absolute deviation are used for penalized maximum likelihood.
Theoretical properties of PMLE and OSE, as well as their approximations
PMLE_T and OSE_T using covariance tapering, are
derived, including consistency, sparsity, asymptotic normality and the oracle
properties. For covariance tapering, a by-product of our theoretical results is
consistency and asymptotic normality of maximum covariance-tapered likelihood
estimates. Finite-sample properties of the proposed methods are demonstrated in
a simulation study and, for illustration, the methods are applied to analyze
two real data sets. Comment: Published at http://dx.doi.org/10.1214/11-AOS919 in the Annals of
Statistics (http://www.imstat.org/aos/) by the Institute of Mathematical
Statistics (http://www.imstat.org)
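The smoothly clipped absolute deviation (SCAD) penalty emphasized in the abstract above can be sketched as follows. This is a minimal illustration of the standard SCAD form, not code from the paper; the function name `scad_penalty` and the default `a = 3.7` (a commonly used value) are assumptions made here.

```python
def scad_penalty(theta, lam, a=3.7):
    """SCAD penalty for a single coefficient theta.

    lam is the tuning parameter (lam > 0) and a > 2 controls where the
    penalty flattens out. Names and the default a are illustrative.
    """
    t = abs(theta)
    if t <= lam:
        # Small coefficients: behaves like the L1 (lasso) penalty.
        return lam * t
    if t <= a * lam:
        # Intermediate coefficients: quadratic transition region.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Large coefficients: constant penalty, so they are not shrunk further.
    return lam * lam * (a + 1) / 2


# The penalty grows linearly near zero and is flat beyond a * lam.
small = scad_penalty(0.5, lam=1.0)
large_1 = scad_penalty(10.0, lam=1.0)
large_2 = scad_penalty(100.0, lam=1.0)
```

The flat tail is what distinguishes SCAD from the lasso: large coefficients incur no additional penalty, which is the property usually associated with the oracle behavior mentioned in the abstract.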
Can Single-Pass Contrastive Learning Work for Both Homophilic and Heterophilic Graph?
Existing graph contrastive learning (GCL) typically requires two forward passes
for a single instance to construct the contrastive loss. Despite its remarkable
success, it is unclear whether such a dual-pass design is (theoretically)
necessary. Besides, the empirical results are hitherto limited to the
homophilic graph benchmarks. Then a natural question arises: Can we design a
method that works for both homophilic and heterophilic graphs with a
performance guarantee? To answer this, we analyze the concentration property of
features obtained by neighborhood aggregation on both homophilic and
heterophilic graphs, introduce the single-pass graph contrastive learning loss
based on the property, and provide performance guarantees of the minimizer of
the loss on downstream tasks. As a direct consequence of our analysis, we
implement the Single-Pass Graph Contrastive Learning method (SP-GCL).
Empirically, on 14 benchmark datasets with varying degrees of heterophily, the
features learned by the SP-GCL can match or outperform existing strong
baselines with significantly less computational overhead, which verifies the
usefulness of our findings in real-world cases. Comment: 20 pages, 6 figures, 9 tables. arXiv admin note: substantial text
overlap with arXiv:2204.0487
Identifying Solar Flare Precursors Using Time Series of SDO/HMI Images and SHARP Parameters
In this paper, we present several methods for constructing precursors of solar
flare events, which show great promise for early prediction. A
data pre-processing pipeline is built to extract useful data from multiple
sources, Geostationary Operational Environmental Satellites (GOES) and Solar
Dynamics Observatory (SDO)/Helioseismic and Magnetic Imager (HMI), to prepare
inputs for machine learning algorithms. Two classification models are
presented: classification of flares from quiet times for active regions and
classification of strong versus weak flare events. We adopt deep learning
algorithms to capture both the spatial and temporal information from HMI
magnetogram data. Effective feature extraction and feature selection with raw
magnetogram data using deep learning and statistical algorithms enable us to
train classification models to achieve almost as good performance as using
active region parameters provided in HMI/Space-Weather HMI-Active Region Patch
(SHARP) data files. Case studies show a significant increase in the prediction
score around 20 hours before strong solar flare events.
A Hierarchical Bayesian Approach to Neutron Spectrum Unfolding with Organic Scintillators
We propose a hierarchical Bayesian model and a state-of-the-art Monte Carlo
sampling method to solve the unfolding problem, i.e., to estimate the spectrum
of an unknown neutron source from the data detected by an organic scintillator.
Inferring neutron spectra is important for several applications, including
nonproliferation and nuclear security, as it allows the discrimination of
fission sources in special nuclear material (SNM) from other types of neutron
sources based on the differences of the emitted neutron spectra. Organic
scintillators interact with neutrons mostly via elastic scattering on hydrogen
nuclei and therefore partially retain neutron energy information. Consequently,
the neutron spectrum can be derived through deconvolution of the measured light
output spectrum and the response functions of the scintillator to monoenergetic
neutrons. The proposed approach is compared to three existing methods using
simulated data to enable controlled benchmarks. We consider three sets of
detector responses. One set corresponds to a 2.5 MeV monoenergetic neutron
source and two sets are associated with (energy-wise) continuous neutron
sources ($^{252}$Cf and $^{241}$AmBe). Our results show that the proposed
method has similar or better unfolding performance compared to other iterative
or Tikhonov regularization-based approaches in terms of accuracy and robustness
against limited detection events, while requiring less user supervision. The
proposed method also provides a posteriori confidence measures, which offer
additional information regarding the uncertainty of the measurements and the
extracted information. Comment: 10 page
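The deconvolution setting described in the abstract above can be illustrated with a Tikhonov-regularized unfolding, one of the baseline families the abstract names. This is a sketch under stated assumptions, not the proposed hierarchical Bayesian method: the response matrix `R`, the spectrum `phi_true`, and the helper names are hypothetical toy values chosen for illustration.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def tikhonov_unfold(R, y, alpha):
    """Return the minimizer of ||R phi - y||^2 + alpha * ||phi||^2."""
    m, n = len(R), len(R[0])
    # Normal equations: (R^T R + alpha I) phi = R^T y
    A = [[sum(R[k][i] * R[k][j] for k in range(m)) + (alpha if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(R[k][i] * y[k] for k in range(m)) for i in range(n)]
    return solve(A, b)


# Toy response matrix (hypothetical numbers): column j is the light-output
# response of the detector to monoenergetic neutrons in energy bin j.
R = [[0.5, 0.3, 0.2],
     [0.0, 0.4, 0.3],
     [0.0, 0.0, 0.3]]
phi_true = [1.0, 2.0, 3.0]
# Noise-free measured light-output spectrum y = R phi_true.
y = [sum(R[i][j] * phi_true[j] for j in range(3)) for i in range(3)]
phi_hat = tikhonov_unfold(R, y, alpha=1e-8)
```

With noise-free data and a very small `alpha`, the estimate is close to `phi_true`; with noisy counts, a larger `alpha` trades bias for stability, and choosing it is the kind of user supervision the abstract refers to.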
A Graphical Model for Fusing Diverse Microbiome Data
This paper develops a Bayesian graphical model for fusing disparate types of
count data. The motivating application is the study of bacterial communities
from diverse high-dimensional features, in this case transcripts, collected
from different treatments. In such datasets, there are no explicit
correspondences between the communities and each corresponds to different
factors, making data fusion challenging. We introduce a flexible
multinomial-Gaussian generative model for jointly modeling such count data.
This latent variable model jointly characterizes the observed data through a
common multivariate Gaussian latent space that parameterizes the set of
multinomial probabilities of the transcriptome counts. The covariance matrix of
the latent variables induces a covariance matrix of co-dependencies between all
the transcripts, effectively fusing multiple data sources. We present a
computationally scalable variational Expectation-Maximization (EM) algorithm
for inferring the latent variables and the parameters of the model. The
inferred latent variables provide a common dimensionality reduction for
visualizing the data and the inferred parameters provide a predictive posterior
distribution. In addition to simulation studies that demonstrate the
variational EM procedure, we apply our model to a bacterial microbiome dataset.
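The generative side of a multinomial-Gaussian model like the one described above can be sketched as follows. This is an illustrative assumption rather than the paper's exact specification: the softmax link from latent values to multinomial probabilities, the function names, and the toy parameters `mu` and `chol` are all hypothetical.

```python
import math
import random


def softmax(z):
    """Map a real vector to probabilities that sum to one."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def sample_counts(mu, chol, n_reads, rng):
    """Draw one count vector from the sketch model.

    z = mu + chol @ eps with eps ~ N(0, I), so z ~ N(mu, chol chol^T);
    p = softmax(z); counts ~ Multinomial(n_reads, p).
    """
    d = len(mu)
    eps = [rng.gauss(0.0, 1.0) for _ in range(d)]
    # chol is lower triangular, so only columns j <= i contribute.
    z = [mu[i] + sum(chol[i][j] * eps[j] for j in range(i + 1)) for i in range(d)]
    p = softmax(z)
    draws = rng.choices(range(d), weights=p, k=n_reads)
    counts = [0] * d
    for idx in draws:
        counts[idx] += 1
    return counts, p


rng = random.Random(0)
mu = [0.0, 1.0, -1.0]
# Lower-triangular factor of the latent covariance (toy values); the
# off-diagonal 0.8 makes the first two features co-vary.
chol = [[1.0, 0.0, 0.0],
        [0.8, 0.6, 0.0],
        [0.0, 0.0, 1.0]]
counts, p = sample_counts(mu, chol, n_reads=1000, rng=rng)
```

In this sketch the latent covariance `chol chol^T` is what induces dependence between features in the observed counts; inference (for example the variational EM procedure mentioned in the abstract) runs in the opposite direction and is not shown.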
A review on N-doped biochar for oxidative degradation of organic contaminants in wastewater by persulfate activation
The persulfate-based advanced oxidation process is the most efficient and commonly used technology to remove organic contaminants in wastewater. Owing to their large surface area, unique electronic properties, abundant N functional groups, cost-effectiveness, and environmental friendliness, N-doped biochars (NBCs) are widely used as catalysts for persulfate activation. This review focuses on NBCs for the oxidative degradation of organic contaminants in wastewater. First, the preparation and modification methods of NBCs are reviewed. Then the catalytic performance of NBCs and modified NBCs in the oxidative degradation of organic contaminants is discussed, with an emphasis on the degradation mechanism. We further summarize the techniques for detecting activation mechanisms and the structural features of NBCs that affect persulfate (PS) activation, followed by the specific role of the N configuration of NBCs in their catalytic capacity. Finally, several challenges in treating organics-contaminated wastewater by persulfate-based advanced oxidation processes are put forward, and recommendations for future research are proposed to further the understanding of the advanced oxidation process activated by NBCs.